Static Analysis Meets Distributed Fault-Tolerance: Enabling State-Machine Replication with Nondeterminism

نویسندگان

  • Joseph G. Slember
  • Priya Narasimhan
چکیده

Midas is an inter-disciplinary approach to supporting state-machine replication for nondeterministic distributed applications. The approach exploits compile-time static analysis to identify both first-hand and second-hand sources of nondeterminism. Subsequent runtime compensation occurs through either the transfer of nondeterministic checkpoints or the reexecution of inserted code, and restores consistency among replicas before each new client request. The approach avoids the need for lock-step synchronization and leverages application-level insight to address only the nondeterminism that matters. Our preliminary evaluation demonstrates Midas’ feasibility and current performance overheads.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Living with Nondeterminism in Replicated Middleware Applications

Application-level nondeterminism can lead to inconsistent state that defeats the purpose of replication as a fault-tolerance strategy. We present Midas, a new approach for living with nondeterminism in distributed, replicated, middleware applications. Midas exploits (i) the static program analysis of the application’s source code prior to replica deployment and (ii) the online compensation of r...

متن کامل

Byzantine Fault Tolerant Execution of Long-running Distributed Applications

Long-running distributed applications that automate critical decision processes require Byzantine fault tolerance to ensure progress in spite of arbitrary failures. Existing replication protocols for data servers guarantee that externally requested operations execute correctly even if a bounded number of replicas fail arbitrarily. However, since these protocols only support passive state machin...

متن کامل

The State Machine Approach: A Tutorial

The state machine approach is a general method for achieving fault tolerance and implementing decentralized control in distributed systems. This paper reviews the approach and identifies abstractions needed for coordinating ensembles of state machines. Implementations of these abstractions for two different failure models Byzantine and fail-stolr--are discussed. The state machine approach is il...

متن کامل

A Guided Tour on the Theory and Practice of State Machine Replication

This chapter presents the fundamentals and applications of the State Machine Replication (SMR) technique for implementing consistent fault-tolerant services. Our focus here is threefold. First we present some fundamentals about distributed computing and three “practical” SMR protocols for different fault models. Second, we discuss some recent work aiming to improve the performance, modularity a...

متن کامل

Improving the palbimm scheduling algorithm for fault tolerance in cloud computing

Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006